AITopics | transfer matrix

Collaborating Authors

transfer matrix

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

SP2RINT: Spatially-Decoupled Physics-Inspired Progressive Inverse Optimization for Scalable, PDE-Constrained Meta-Optical Neural Network Training

Ma, Pingchuan, Yin, Ziang, Jing, Qi, Gao, Zhengqi, Gangi, Nicholas, Zhang, Boyang, Huang, Tsung-Wei, Huang, Zhaoran, Boning, Duane S., Yao, Yu, Gu, Jiaqi

arXiv.org Artificial IntelligenceMay-30-2025

DONNs leverage light propagation for efficient analog AI and signal processing. Advances in nanophotonic fabrication and metasurface-based wavefront engineering have opened new pathways to realize high-capacity DONNs across various spectral regimes. Training such DONN systems to determine the metasurface structures remains challenging. Heuristic methods are fast but oversimplify metasurfaces modulation, often resulting in physically unrealizable designs and significant performance degradation. Simulation-in-the-loop optimizes implementable metasurfaces via adjoint methods, but is computationally prohibitive and unscalable. To address these limitations, we propose SP2RINT, a spatially decoupled, progressive training framework that formulates DONN training as a PDE-constrained learning problem. Metasurface responses are first relaxed into freely trainable transfer matrices with a banded structure. We then progressively enforce physical constraints by alternating between transfer matrix training and adjoint-based inverse design, avoiding per-iteration PDE solves while ensuring final physical realizability. To further reduce runtime, we introduce a physics-inspired, spatially decoupled inverse design strategy based on the natural locality of field interactions. This approach partitions the metasurface into independently solvable patches, enabling scalable and parallel inverse design with system-level calibration. Evaluated across diverse DONN training tasks, SP2RINT achieves digital-comparable accuracy while being 1825 times faster than simulation-in-the-loop approaches. By bridging the gap between abstract DONN models and implementable photonic hardware, SP2RINT enables scalable, high-performance training of physically realizable meta-optical neural systems. Our code is available at https://github.com/ScopeX-ASU/SP2RINT

artificial intelligence, machine learning, transfer matrix, (17 more...)

arXiv.org Artificial Intelligence

2505.18377

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

A Centralized-Distributed Transfer Model for Cross-Domain Recommendation Based on Multi-Source Heterogeneous Transfer Learning

Xu, Ke, Wang, Ziliang, Zheng, Wei, Ma, Yuhao, Wang, Chenglin, Jiang, Nengxue, Cao, Cai

arXiv.org Artificial IntelligenceNov-14-2024

Cross-domain recommendation (CDR) methods are proposed to tackle the sparsity problem in click through rate (CTR) estimation. Existing CDR methods directly transfer knowledge from the source domains to the target domain and ignore the heterogeneities among domains, including feature dimensional heterogeneity and latent space heterogeneity, which may lead to negative transfer. Besides, most of the existing methods are based on single-source transfer, which cannot simultaneously utilize knowledge from multiple source domains to further improve the model performance in the target domain. In this paper, we propose a centralized-distributed transfer model (CDTM) for CDR based on multi-source heterogeneous transfer learning. To address the issue of feature dimension heterogeneity, we build a dual embedding structure: domain specific embedding (DSE) and global shared embedding (GSE) to model the feature representation in the single domain and the commonalities in the global space,separately. To solve the latent space heterogeneity, the transfer matrix and attention mechanism are used to map and combine DSE and GSE adaptively. Extensive offline and online experiments demonstrate the effectiveness of our model.

recommendation, source domain, target domain, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICDM54844.2022.00166

2411.09286

Country: Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.61)

Add feedback

What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents

Jin, Mingyu, Wang, Beichen, Xue, Zhaoqian, Zhu, Suiyuan, Hua, Wenyue, Tang, Hua, Mei, Kai, Du, Mengnan, Zhang, Yongfeng

arXiv.org Artificial IntelligenceFeb-20-2024

In this study, we introduce "CosmoAgent," an innovative artificial intelligence framework utilizing Large Language Models (LLMs) to simulate complex interactions between human and extraterrestrial civilizations, with a special emphasis on Stephen Hawking's cautionary advice about not sending radio signals haphazardly into the universe. The goal is to assess the feasibility of peaceful coexistence while considering potential risks that could threaten well-intentioned civilizations. Employing mathematical models and state transition matrices, our approach quantitatively evaluates the development trajectories of civilizations, offering insights into future decision-making at critical points of growth and saturation. Furthermore, the paper acknowledges the vast diversity in potential living conditions across the universe, which could foster unique cosmologies, ethical codes, and worldviews among various civilizations. Recognizing the Earth-centric bias inherent in current LLM designs, we propose the novel concept of using LLMs with diverse ethical paradigms and simulating interactions between entities with distinct moral principles. This innovative research provides a new way to understand complex inter-civilizational dynamics, expanding our perspective while pioneering novel strategies for conflict resolution, crucial for preventing interstellar conflicts. We have also released the code and datasets to enable further academic investigation into this interesting area of research.

agent, civilization, interaction, (14 more...)

arXiv.org Artificial Intelligence

2402.13184

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey (0.04)
Europe > Portugal (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.88)
Research Report > New Finding (0.66)

Industry: Government (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

M3ICRO: Machine Learning-Enabled Compact Photonic Tensor Core based on PRogrammable Multi-Operand Multimode Interference

Gu, Jiaqi, Zhu, Hanqing, Feng, Chenghao, Jiang, Zixuan, Chen, Ray T., Pan, David Z.

arXiv.org Artificial IntelligenceDec-28-2023

Photonic computing shows promise for transformative advancements in machine learning (ML) acceleration, offering ultra-fast speed, massive parallelism, and high energy efficiency. However, current photonic tensor core (PTC) designs based on standard optical components hinder scalability and compute density due to their large spatial footprint. To address this, we propose an ultra-compact PTC using customized programmable multi-operand multimode interference (MOMMI) devices, named M3ICRO. The programmable MOMMI leverages the intrinsic light propagation principle, providing a single-device programmable matrix unit beyond the conventional computing paradigm of one multiply-accumulate (MAC) operation per device. To overcome the optimization difficulty of customized devices that often requires time-consuming simulation, we apply ML for optics to predict the device behavior and enable a differentiable optimization flow. We thoroughly investigate the reconfigurability and matrix expressivity of our customized PTC, and introduce a novel block unfolding method to fully exploit the computing capabilities of a complex-valued PTC for near-universal real-valued linear transformations. Extensive evaluations demonstrate that M3ICRO achieves a 3.4-9.6x smaller footprint, 1.6-4.4x higher speed, 10.6-42x higher compute density, 3.7-12x higher system throughput, and superior noise robustness compared to state-of-the-art coherent PTC designs, while maintaining close-to-digital task accuracy across various ML benchmarks. Our code is open-sourced at https://github.com/JeremieMelo/M3ICRO-MOMMI.

expressivity, icro, ptc, (17 more...)

arXiv.org Artificial Intelligence

2305.19505

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Arizona > Maricopa County > Tempe (0.04)
Europe > San Marino > Fiorentino > Fiorentino (0.04)
Asia > Japan > Honshū > Chūgoku > Okayama Prefecture > Okayama (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Learning Quantum Processes and Hamiltonians via the Pauli Transfer Matrix

Caro, Matthias C.

arXiv.org Artificial IntelligenceJul-16-2023

Learning about physical systems from quantum-enhanced experiments, relying on a quantum memory and quantum processing, can outperform learning from experiments in which only classical memory and processing are available. Whereas quantum advantages have been established for a variety of state learning tasks, quantum process learning allows for comparable advantages only with a careful problem formulation and is less understood. We establish an exponential quantum advantage for learning an unknown $n$-qubit quantum process $\mathcal{N}$. We show that a quantum memory allows to efficiently solve the following tasks: (a) learning the Pauli transfer matrix of an arbitrary $\mathcal{N}$, (b) predicting expectation values of bounded Pauli-sparse observables measured on the output of an arbitrary $\mathcal{N}$ upon input of a Pauli-sparse state, and (c) predicting expectation values of arbitrary bounded observables measured on the output of an unknown $\mathcal{N}$ with sparse Pauli transfer matrix upon input of an arbitrary state. With quantum memory, these tasks can be solved using linearly-in-$n$ many copies of the Choi state of $\mathcal{N}$, and even time-efficiently in the case of (b). In contrast, any learner without quantum memory requires exponentially-in-$n$ many queries, even when querying $\mathcal{N}$ on subsystems of adaptively chosen states and performing adaptively chosen measurements. In proving this separation, we extend existing shadow tomography upper and lower bounds from states to channels via the Choi-Jamiolkowski isomorphism. Moreover, we combine Pauli transfer matrix learning with polynomial interpolation techniques to develop a procedure for learning arbitrary Hamiltonians, which may have non-local all-to-all interactions, from short-time dynamics. Our results highlight the power of quantum-enhanced experiments for learning highly complex quantum dynamics.

artificial intelligence, machine learning, quantum memory, (15 more...)

arXiv.org Artificial Intelligence

2212.04471

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

Sample, computation vs storage tradeoffs for classification using tensor subspace models

Chaghazardi, Mohammadhossein, Aeron, Shuchin

arXiv.org Machine LearningJun-27-2017

Our main tool is the use of tensor subspaces, i.e. subspaces with a Kronecker structure, for embedding the data into lower dimensions. Among the subspaces with a Kronecker structure, we show that using subspaces with a hierarchical structure for representing data leads to improved tradeoffs. One of the main reasons for the improvement is that embedding data into these hierarchical Kronecker structured subspaces prevents overfitting at higher latent dimensions.

artificial intelligence, machine learning, subspace, (16 more...)

arXiv.org Machine Learning

1706.05599

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Knowledge Graph Completion with Adaptive Sparse Transfer Matrix

Ji, Guoliang (Institute of Automation, Chinese Academy of Sciences) | Liu, Kang (Institute of Automation, Chinese Academy of Sciences) | He, Shizhu (Institute of Automation, Chinese Academy of Sciences) | Zhao, Jun (Institute of Automation, Chinese Academy of Sciences)

AAAI ConferencesApr-19-2016

We model knowledge graphs for their completion by encoding each entity and relation into a numerical space. All previous work including Trans(E, H, R, and D) ignore the heterogeneity (some relations link many entity pairs and others do not) and the imbalance (the number of head entities and that of tail entities in a relation could be different) of knowledge graphs. In this paper, we propose a novel approach TranSparse to deal with the two issues. In TranSparse, transfer matrices are replaced by adaptive sparse matrices, whose sparse degrees are determined by the number of entities (or entity pairs) linked by relations. In experiments, we design structured and unstructured sparse patterns for transfer matrices and analyze their advantages and disadvantages. We evaluate our approach on triplet classification and link prediction tasks. Experimental results show that TranSparse outperforms Trans(E, H, R, and D) significantly, and achieves state-of-the-art performance.

artificial intelligence, machine learning, relation, (17 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.93)

Add feedback

A Bayesian Approach to Sparse plus Low rank Network Identification

Zorzi, Mattia, Chiuso, Alessandro

arXiv.org Machine LearningSep-26-2015

We consider the problem of modeling multivariate time series with parsimonious dynamical models which can be represented as sparse dynamic Bayesian networks with few latent nodes. This structure translates into a sparse plus low rank model. In this paper, we propose a Gaussian regression approach to identify such a model.

artificial intelligence, machine learning, posterior mean, (18 more...)

arXiv.org Machine Learning

1503.0734

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

COT: Contextual Operating Tensor for Context-Aware Recommender Systems

Liu, Qiang (Institute of Automation, Chinese Academy of Sciences) | Wu, Shu (Institute of Automation, Chinese Academy of Sciences) | Wang, Liang (Institute of Automation, Chinese Academy of Sciences)

AAAI ConferencesMar-6-2015

With rapid growth of information on the internet, recommender systems become fundamental for helping users alleviate the problem of information overload. Since contextual information can be used as a significant factor in modeling user behavior, various context-aware recommendation methods are proposed. However, the state-of-the-art context modeling methods treat contexts as other dimensions similar to the dimensions of users and items, and cannot capture the special semantic operation of contexts. On the other hand, some works on multi-domain relation prediction can be used for the context-aware recommendation, but they have problems in generating recommendation under a large amount of contextual information. In this work, we propose Contextual Operating Tensor (COT) model, which represents the common semantic effects of contexts as a contextual operating tensor and represents a context as a latent vector. Then, to model the semantic operation of a context combination, we generate contextual operating matrix from the contextual operating tensor and latent vectors of contexts. Thus latent vectors of users and items can be operated by the contextual operating matrices. Experimental results show that the proposed COT model yields significant improvements over the competitive compared methods on three typical datasets, i.e., Food, Adom and Movielens-1M datasets.

contextual information, dataset, latent vector, (14 more...)

AAAI Conferences

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

Asia > China (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Media > Film (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback